Abstract

The translocation of macromolecules into the nucleus is a fundamental eukaryotic process, regulating gene expression, cell
division and differentiation, but is impaired in a range of significant diseases including cancer and viral infection. The
import of proteins into the nucleus is generally initiated by a specific, high affinity interaction between nuclear localisation
signals (NLSs) and nuclear import receptors in the cytoplasm, and terminated through the disassembly of these complexes
in the nucleus. For classical NLSs (cNLSs), this import is mediated by the importin-α (IMPα) adaptor protein, which in turn
binds to IMPβ to mediate translocation of nuclear cargo across the nuclear envelope. The interaction and disassembly of
import receptor:cargo complexes is reliant on the differential localisation of nucleotide bound Ran across the envelope,
maintained in its low affinity, GDP-bound form in the cytoplasm, and its high affinity, GTP-bound form in the nucleus. This in
turn is maintained by the differential localisation of Ran regulating proteins, with RanGAP in the cytoplasm maintaining Ran
in its GDP-bound form, and RanGEF (Prp20 in yeast) in the nucleus maintaining Ran in its GTP-bound form. Here, we
describe the 2.1 Å resolution x-ray crystal structure of IMPα in complex with the NLS of Prp20. We observe 1,091 Å² of
buried surface area mediated by an extensive array of contacts involving residues on armadillo repeats 2-7, utilising both
the major and minor NLS binding sites of IMPα to contact bipartite NLS clusters ¹⁷RAKKMSK²³ and ³KR⁴, respectively. One
notable feature of the major site is the insertion of Prp20 NLS Ala18 between the P0 and P1 NLS sites, noted in only a few
classical bipartite NLSs. This study provides a detailed account of the binding mechanism enabling Prp20 interaction with
the nuclear import receptor, and additional new information for the interaction between IMPα and cargo.
Introduction

A distinguishing feature of all eukaryotic cells is the containment
of their genetic material within a stable and segregated nuclear
organelle. This in turn requires the active, bidirectional transport
of proteins and RNA across the nuclear envelope, a process central
to a range of important cellular events including DNA replication
and cell differentiation, and to diseases including cancer and viral
replication. In the classical nucleocytoplasmic transport pathway,
the nuclear import receptor importin-α (IMPα) recognises a
nuclear localisation signal (NLS) displayed on a cargo protein, and
this dimer, through interaction with importin-β (IMPβ), is docked
and translocated across the nuclear pore complex through
interaction with nucleoporins [1-3]. Once in the nucleus, the
IMPα:IMPβ:cargo complex is dissociated by RanGTP, and the
IMPs are recycled back to the cytoplasm for a further round of
import [4,5].

The initial interaction of the nuclear import pathway, between
IMPα and the nuclear import cargo, has been studied by a range of
structural and functional approaches. IMPα, a large (529 residue),
highly conserved macromolecule, is composed of two structural
domains: a short, basic 10 kDa N-terminal domain that binds
importin-β (IBB domain), and a 50 kDa armadillo (ARM)-repeat
NLS binding domain that recognises and binds NLSs of various
cargo proteins [6,7]. Interaction with the NLS generally occurs at
the concave face of the ARM domain, at locations that are
typically determined by the type of NLS: monopartite NLSs (composed
primarily of a single cluster of positively charged residues) bind at
the major NLS-binding site, and bipartite NLSs, comprised of
two positively charged clusters separated by a 10-12 residue linker, bind
by spanning both the major and minor sites of IMPα [7-9]. The
usual nomenclature for describing the interactions between IMPα
and NLSs [10,11] designates residues binding in the minor site as
P1', P2' etc., and residues binding the major site as P1, P2 etc.
The consensus sequences correspond to K[K/R]X[K/R] (corresponding
to positions P2-P5; [K/R] represents Lys or Arg, and X
represents any amino acid) for monopartite cNLSs, and
[K/R][K/R]X₁₀₋₁₂[K/R]₃/₅ (with the N-terminal basic cluster
corresponding to positions P1'-P2'; [K/R]₃/₅ represents at least three
Lys or Arg residues within five consecutive residues) for bipartite cNLSs [6].
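
As a rough illustration of how the consensus patterns above can be applied, the following sketch scans a sequence for the monopartite and bipartite motifs. It assumes that the [K/R]₃/₅ term means at least three Lys/Arg residues within a window of five consecutive residues; the function names are hypothetical, and the test sequence is the Prp20 NLS fragment (residues 3-23) quoted in the Materials and Methods. A pattern match is only a sequence-level hint and is not evidence of IMPα binding.

```python
import re

BASIC = "KR"  # Lys and Arg, the [K/R] class in the consensus


def find_monopartite(seq):
    """Return 0-based start indices of K[K/R]X[K/R] matches (positions P2-P5)."""
    # The lookahead reports overlapping matches as well.
    return [m.start() for m in re.finditer(r"(?=K[KR].[KR])", seq)]


def find_bipartite(seq, linker_range=(10, 12), window=5, min_basic=3):
    """Return (start, linker_length, c_terminal_window) for each bipartite hit.

    Assumption: [K/R]3/5 is read as >= min_basic basic residues in a window
    of `window` consecutive residues following the linker.
    """
    hits = []
    for i in range(len(seq) - 1):
        # N-terminal cluster: two consecutive basic residues (P1'-P2').
        if seq[i] not in BASIC or seq[i + 1] not in BASIC:
            continue
        for linker in range(linker_range[0], linker_range[1] + 1):
            start = i + 2 + linker
            cluster = seq[start:start + window]
            n_basic = sum(residue in BASIC for residue in cluster)
            if len(cluster) == window and n_basic >= min_basic:
                hits.append((i, linker, cluster))
    return hits


# Prp20 NLS fragment, residues 3-23, as quoted in the Materials and Methods.
PRP20_NLS = "KRTVATNGDASGAHRAKKMSK"

print(find_monopartite(PRP20_NLS))
print(find_bipartite(PRP20_NLS))
```

Under these assumptions the fragment yields no monopartite match, while the bipartite scan reports the N-terminal KR pair followed by a basic window after an 11- or 12-residue linker. The linker bounds and window size are parameters so that looser definitions can be tried.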

The directionality of the nuclear transport process is governed
by the differential localisation of Ran across the nuclear envelope
(reviewed in [12-14]). Specifically, the nucleotide bound state of
Ran results in conformational changes in two surface loops of the
protein, termed switch I and switch II, which in turn mediate its
ability to dissociate importin:cargo complexes in the GTP-bound
form [5,15]. This is achieved through the asymmetric distribution
of Ran regulatory proteins across the nuclear envelope, which
modulate the nucleotide bound state of Ran: the Ran guanine
nucleotide exchange factor (RCC1/Prp20), predominantly localised to the
nucleus, maintains Ran in a GTP-bound form that binds IMPs
with high affinity and dissociates the complex upon nuclear entry,
whilst the Ran GTPase activating protein (RanGAP), located in the
cytoplasm, maintains Ran in a GDP-bound conformation. Thus,
nuclear localisation of Prp20 plays a vital role in establishing the
directionality of the nuclear transport process.

Previous studies have elucidated the region within Prp20
responsible for its nuclear localisation [16-18]; however, the
detailed mechanism of its import remained to be fully determined.
The NLS region is contained within the N-terminal 25 residues,
and interacts with IMPα. Here, using x-ray crystallography, we
elucidate the binding interface between IMPα and the NLS region
of Prp20, and provide a structural comparison to other known
IMPα:NLS cargo interaction interfaces.

Materials and Methods 

Expression and Purification 

Mouse IMPαΔIBB (residues 70-529) was overexpressed using the
isopropyl-thio-β-D-galactoside (IPTG) induction method as outlined in [19].
Cells were lysed by freeze-thawing in the presence of 20 mg of lysozyme,
and the sample was purified by nickel-nitrilotriacetic acid (Ni-NTA)
affinity chromatography: the cleared bacterial extract was injected
onto a 5 mL HisTrap HP column (GE Healthcare) in His Buffer A
(50 mM phosphate buffer, 300 mM NaCl, 20 mM imidazole, pH 8) using an
ÄKTApurifier FPLC (GE Healthcare), washed, and eluted with His Buffer B
(50 mM phosphate buffer, 300 mM NaCl, 500 mM imidazole, pH 8).
Peak fractions were pooled and loaded onto a HiLoad 26/60
Superdex 200 column (GE Healthcare) in 20 mM Tris
pH 7.8, 125 mM NaCl, for size exclusion chromatography;
peak fractions were collected and applied to a GST column loaded
with GST-tagged Prp20 (S. cerevisiae RanGEF) NLS (residues 3-23,
³KRTVATNGDASGAHRAKKMSK²³). The Prp20 NLS was overexpressed
as a GST-fusion protein using the autoinduction method
as previously described in [6]. GST:Prp20 NLS was injected and
immobilised on a GSTrap FF column (GE Healthcare), and washed
extensively in binding buffer containing 50 mM Tris pH 7.8,
125 mM NaCl. Purified IMPαΔIBB was then passed over the
column containing GST:Prp20 NLS, washed, and eluted in
binding buffer containing 10 mM glutathione. The GST-tag was
removed by overnight treatment with thrombin at 4°C, and the
complex purified by a further round of size exclusion chromatography.
The complex, in 20 mM Tris pH 7.8 and 125 mM NaCl,
was then concentrated to 20 mg/mL (Amicon, MWCO 10 kDa,
Millipore), aliquoted, flash-frozen in liquid nitrogen, and
stored at -80°C.

Crystallization and Data Collection 

Single rod-shaped crystals, measuring 200 × 50 × 50 µm, were
obtained in 500 mM sodium citrate pH 6-8, 10 mM DTT after
2 d, harvested with a cryoprotectant composed of 80% mother
liquor and 20% glycerol, and flash-frozen in liquid nitrogen.
Diffraction data were collected at the Australian Synchrotron
MX2 beamline using BLU-ICE software [20]. 180° of diffraction
data (0.5° oscillations) were integrated, scaled, and converted to
structure factors using MOSFLM [21], SCALA [22] and TRUNCATE
[23].

Structure Determination and Refinement 

Diffraction images were integrated and scaled to 2.1 Å
resolution in iMOSFLM, with an Rmerge of 6.7% (data statistics
are summarized in Table 1). IMPα residues 72-497 from the
nucleoplasmin NLS complex structure (PDB ID 3UL1) [6] were
used as the search model for molecular replacement to generate
phases and an initial electron density map, with the test set
reflections transferred from the search model dataset. Both rigid
body and restrained refinement were performed using REFMAC.
The Prp20 NLS backbone was built manually through iterative cycles of
COOT and REFMAC [24,25].
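
To make the Rmerge statistic quoted above concrete, the sketch below computes it from a list of repeated intensity measurements, following the definition given in the footnotes to Table 1 (the summed absolute deviation of each measurement from its reflection mean, divided by the summed mean intensities). The function name and the toy intensities are illustrative assumptions and are not taken from the deposited data; in practice the value is reported by the scaling program.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge from (hkl, intensity) pairs.

    `observations` is an iterable of ((h, k, l), intensity) tuples, with
    symmetry-equivalent measurements already mapped to the same hkl key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0    # sum over hkl and i of |I_hkl,i - <I_hkl>|
    denominator = 0.0  # sum over hkl and i of <I_hkl>
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += mean_intensity * len(intensities)
    return numerator / denominator


# Toy data (hypothetical): two reflections measured three and two times.
toy = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0), ((1, 0, 0), 90.0),
    ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
]
print(f"Rmerge = {r_merge(toy):.3f}")
```

For the toy data the deviations sum to 20 and the mean intensities sum to 400, giving an Rmerge of 0.05 (5%).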

Results and Discussion 

For Prp20 to maintain Ran in its nuclear, GTP-bound state, it
must first be translocated to the nucleus. The region within Prp20
responsible for directing this localisation has been clearly defined
[16], and shown to reside within residues 1-25. Using amino acid
substitutions, the NLS was shown to be bipartite, consisting of
residues ³KR⁴, a 12-residue spacer, and ¹⁷RAKKMSK²³. The C-terminal
cluster does not conform to the conventional NLS
consensus. To elucidate the structural basis for the interaction
between the nuclear import receptor IMPα and the NLS region of
the nucleotide exchange factor Prp20, the domains previously
demonstrated to mediate this interaction [16] were recombinantly



Table 1. Crystallographic data.

Data collection
  Space group                  P2₁2₁2₁
  Unit cell dimensions (Å)     a = 78.91, b = 89.92, c = 99.70
  Resolution range (Å)         36.37-2.10 (2.16-2.10) ᵃ
  Total reflections            294,256 (24,607) ᵃ
  Unique observations          53,834 (3,413) ᵃ
  Completeness (%)             100 (100) ᵃ
  Multiplicity                 7.0 (7.2) ᵃ
  Rmerge ᵇ                     0.07 (0.30) ᵃ
  Average I/σ(I)               16.7 (6.0) ᵃ
  Mosaicity                    0.6

Refinement
  Rcryst/Rfree (%) ᶜ           16.7 (18.9)/20.1 (20.4)
  Bond length RMSD (Å)         0.022
  Bond angle RMSD (°)          2.09
  Average B factor (Å²)        34.80
  Ramachandran plot (%) ᵈ
    Favoured                   99
    Outliers ᵉ                 0.23

ᵃ Numbers in parenthesis are for the highest resolution shell.
ᵇ $R_{merge} = \sum_{hkl} \sum_i |I_{hkl,i} - \langle I_{hkl} \rangle| / \sum_{hkl} \sum_i \langle I_{hkl} \rangle$, where $I_{hkl,i}$ is the intensity of
an individual measurement of the reflection with Miller indices h, k and l, and
$\langle I_{hkl} \rangle$ is the mean intensity of that reflection.
ᶜ $R_{cryst} = \sum_{hkl} \left| |F^{obs}_{hkl}| - |F^{calc}_{hkl}| \right| / \sum_{hkl} |F^{obs}_{hkl}|$, where $|F^{obs}_{hkl}|$ and $|F^{calc}_{hkl}|$ are the
observed and calculated structure factor amplitudes. $R_{free}$ is equivalent to $R_{cryst}$
but calculated with reflections (5%) omitted from the refinement process.
ᵈ Calculated with the program PROCHECK.
ᵉ Asn239 is a Ramachandran outlier in all IMPα structures.

doi:10.1371/journal.pone.0082038.t001
expressed, purified to homogeneity, and isolated as an equimolar
complex. Recombinant IMPαΔIBB (IMPα lacking the N-terminal
IBB domain, thus preventing autoinhibition of the NLS binding
site), consisting of 10 consecutive ARM repeats, was first
purified by affinity and size exclusion chromatography, then
applied to a column loaded with purified GST-tagged Prp20 NLS.
Excess IMPαΔIBB was removed through extensive column
washing, and subsequent elution, affinity tag cleavage, and size
exclusion chromatography enabled efficient separation of the
IMPαΔIBB:Prp20 NLS complex (>50 kDa) from excess
Prp20 NLS (<5 kDa), thus isolating a homogeneous, equimolar
complex.

Large, strongly diffracting crystals were grown in a citrate-containing
solution, under the conditions described in the
Materials and Methods section. Using synchrotron radiation,
crystals diffracted to 2.1 Å resolution and were resistant to
radiation damage, permitting 180° of data to be collected from
a single crystal without deterioration in diffraction quality (Fig. 1).
Following rigid body refinement and simulated annealing, analysis
of both 2Fo − Fc and simulated annealed omit maps revealed clear
density in the major and minor NLS-binding sites of IMPα,
enabling residues of the Prp20 NLS to be built (Fig. 2).

A final structural model, comprised of IMPα residues 72 to 497
bound to Prp20 NLS residues ³KR⁴ and ¹⁷RAKKMSK²³, together with 126
water molecules, has good stereochemistry and has been deposited to
the PDB (Table 1). Residues 6-15 of Prp20 could not be discerned
from the electron density, a common observation for bipartite
NLSs with long linker regions [6], and these residues were omitted
from the final model. The 425 residues comprising IMPαΔIBB are
structured into 10 ARM repeats, with an overall arrangement
similar to that of available IMPα structures (e.g. the RMSDs for the
equivalent Cα atoms from the structures with PDB IDs 1EJY,
1EJL and 1PJM are 0.28, 0.29, and 0.30 Å, respectively). The
interaction between IMPα and the Prp20 NLS is made through
an extensive array of contacts involving residues contained within
ARM repeats 2 through 7, utilising both the major and minor
NLS binding sites of IMPα to contact Prp20 NLS ¹⁷RAKKMSK²³ and
the canonical Prp20 NLS ³KR⁴ motif, respectively, and exhibiting a
total of 1,091 Å² buried surface area. One notable feature of the
major site is the insertion of Prp20 NLS Ala18 between the P0 and P1
NLS positions, noted in only a few classical bipartite NLSs. This
results in hydrogen-bonding interactions between
Prp20 NLS Ala18 and the side chains of Trp231 and Arg238 of
IMPαΔIBB (Fig. 3). The interaction at the P0-binding site is
mediated by a salt bridge between the guanidinium side chain of
Prp20 NLS Arg17 and the carboxylate side chain of Asp270, as well as
hydrogen bonding between the main chain of Prp20 NLS Arg17 and
the side chain of Arg238. At the P1-binding site, the main chain of
Prp20 NLS Lys19 forms hydrogen bonds with the side chain of Asn235
[ND2] (Table 2). The prominent P2-binding site displays multiple
interactions, involving a salt bridge between the ammonium side
chain of Prp20 NLS Lys20 and the carboxylate side chain of Asp192, as well as
hydrogen bonding between the side chain of Prp20 NLS Lys20 and both the
side chain oxygen of Thr155 and the main chain of Gly150.
At the P3-binding site, hydrogen bonds and hydrophobic
interactions are observed between the main chain of
Prp20 NLS Met21 and the side chains of Trp184 and Asn188. At the
P5-binding pocket, the main chain of Prp20 NLS Lys23 interacts with the
side chains of Ser 10 ' 1 , Trp142, and Asn146, while the Prp20 NLS Lys23
main chain N is hydrogen-bonded with the side chain of Asn 1 .

The minor site involves hydrogen-bond interactions between
the side chain of Prp20 NLS Lys3 and the side chain of Thr328 and the
main chain of Val321 and Asn361, whilst the P2'-binding site shows
multiple interactions, involving a salt bridge between the
guanidinium of Prp20 NLS Arg4 and the carboxylate of Glu396,
hydrogen bonding between the Prp20 NLS Arg4 side chain and the
main chain of Ser360, and hydrogen bonding between the main chain
of Prp20 NLS Arg4 and the side chain of Asn361.

The interface observed between IMPα and the side chain and main
chain residues of the Prp20 NLS both consolidates known binding
site information for IMPα and provides additional
interactions not previously described [6,9-11,26,27]. The
P0-binding site generally comprises Asn235 and Arg238 and involves
hydrophobic interactions. In our structure it retains the interaction
with Arg238; however, the interaction with Asn235 is disrupted,
and is instead replaced by a strong salt bridge between Arg17
and Asp270. Consistent with our structural observations, mutation
of Prp20 NLS Arg17 to Thr resulted in reduced binding to IMPα [16].

The P1-binding pocket generally accommodates a long, positively
charged NLS side chain, despite the fact that the interaction
generally involves only nonspecific interactions with the main
chain of the NLS. Indeed, our structure is consistent with this
general observation, with Prp20 NLS Lys19 interacting only
through its main chain with the side chain of Asn235. Analysis of
Figure 1. Diffraction image (left) and Rmerge statistics across batches (right), demonstrating crystal resistance to radiation damage
during data collection.

Figure 2. Cartoon overview of the IMPα (in ribbon):Prp20 (stick model) complex structure (top), superimposed on the Fo-Fc
annealed omit map (green; calculated using Phenix [30], contoured at 2.0 σ). Figures were produced using PyMOL (DeLano Scientific LLC).
doi:10.1371/journal.pone.0082038.g002

Figure 3. Structure of the complex between Prp20 NLS (grey sticks) and IMPα (grey cartoon backbone and black sticks),
highlighting interactions at specific positions. The first two panels highlight the interactions at the minor site (NLS residues K3 and R4), and the
remaining panels highlight the interactions at the major site (NLS residues ¹⁷RAKKMSK²³). Figures were produced using PyMOL (DeLano Scientific
LLC).
doi:10.1371/journal.pone.0082038.g003
Table 2. NLS binding to the major and minor sites of IMPα.

NLS       Minor site           Linker*            Major site              PDB ID
          P1'  P2'  P3'  P4'                      P1   P2   P3   P4   P5
Prp20     K    R    T    V     ATNGDASGAHRA       K    K    M    S    K   This study
Bimax1    K    R    P    L     EWDEDEEPP          R    K    R    K    R   3UKW
Rb        K    R    S    A     EGSNPPKP           L    K    K    L    R   1PJM
Npl       K    R    P    A     ATKKAGQ            A    K    K    K    K   1EE5
yCBP80    K    R    R    G     DFDEDENYRDFRPRM    P    K    R    Q    R   3UKY
PB2       K    R    D    S     SILTDSQTA          T    K    R    I    R   2JDQ

*Italics indicate that the sequence could not be discerned from the electron density and was omitted from the model.
References for PDB IDs: 3UKW [6], 1PJM [9], 1EE5 [28], 3UKY [6], 2JDQ [29].

doi:10.1371/journal.pone.0082038.t002



IMPα bound to NLSs has previously revealed P2 as the most
critical position in the NLS. The binding pocket is predominantly
comprised of Gly150, Thr155 and Asp192 on IMPα, and
appears best suited for binding a lysine residue; these structural
observations have been confirmed through site-directed mutagenesis
studies, where mutation of Lys to Ala at the P2 position not
only abolished nuclear localisation of the protein, but also reduced
the affinity for IMPα ~300-fold. Consistently, substitution of
Prp20 NLS Lys20 to Thr severely reduces the interaction of Prp20 with IMPα,
to approximately 20% of that of the wild-type protein [16]. That
the larger arginine side chain is less energetically favourable in the
P2 position has also been demonstrated through a K128R
substitution in the SV40 TAg cNLS, which resulted in a
~3 kcal/mol decrease in binding free energy. This high conservation
of the P2 lysine is rationalized through its specific and
extensive hydrogen-bonding interactions with IMPα; the terminal
nitrogen atom of the lysine side chain coordinates with the main
chain carbonyl group of Gly150, with the hydroxyl in the side chain
of Thr155, and with the negatively charged side chain of Asp192.
These precise interactions were also observed in our structure.
Furthermore, Prp20 NLS Met21 occupies the P3 position in our
structure, and interacts with the side chains of Trp184 and Asn188.
This is consistent with nuclear import assays in yeast, which
showed no defects in Prp20 import when Met21 was substituted with
a Thr residue [16]. The P4-binding site exhibits a slight preference
for arginine, because it is able to make the most favourable
interactions with ARM repeats 1 and 2; however, a greater
tolerance within this binding pocket has been noted than for the
P2 position, and consistently, the energy contribution from this
pocket is ~1/4 of the contribution of the P2 residue.
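
The fold changes in affinity and the free energy differences quoted above are related by the standard thermodynamic expression ΔΔG = RT ln(Kd,mutant/Kd,wild-type). The sketch below converts between the two as a rough consistency check. The temperature of 25 °C is an assumption (the measurement temperatures are not stated here), and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free energy penalty (kcal/mol) for a given fold loss in affinity."""
    return R_KCAL * temperature_k * math.log(fold_change)


def fold_change_from_ddg(ddg_kcal, temperature_k=298.15):
    """Fold loss in affinity corresponding to a free energy penalty."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


# ~300-fold affinity loss reported for Lys-to-Ala at P2.
print(f"300-fold loss  -> {ddg_from_fold_change(300):.2f} kcal/mol")
# ~3 kcal/mol penalty reported for K128R in the SV40 TAg cNLS.
print(f"3 kcal/mol     -> {fold_change_from_ddg(3.0):.0f}-fold")
```

At the assumed temperature, a 300-fold loss in affinity corresponds to roughly 3.4 kcal/mol, and a 3 kcal/mol penalty to a loss of roughly 160-fold, so the two figures quoted for the P2 position are of a similar magnitude.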

The minor site P1' and P2' positions contain the 'KR' motif in
nearly all IMPα:NLS structures solved to date, with
replacement by non-KR residues commonly resulting in cytoplasmic
localisation of the protein. The P1'-binding pocket is generally
defined by residues Thr328 and Asn361. A lysine is
preferred over arginine in this binding cavity (as observed
in our structure), because an arginine side chain at this position is
too long to make optimal interactions with the IMPα side chains;
nevertheless, arginine can still be accommodated, e.g. in the case of CBP80 [6].
The P2'-binding pocket is defined by residues Ser360 and Glu396
within ARM repeats 7 and 8. In contrast to P1', whilst a lysine
can be accommodated at this position, an arginine side chain is
able to make more favourable contacts with the IMPα minor binding
site. The Prp20 NLS therefore contains the most favoured
arrangement, a KR motif at the P1' and P2' positions, forming both
specific side chain salt bridges and main chain hydrogen-bonding
interactions. Mutations of these residues result in a weaker
interaction with IMPα and defects in nuclear localisation in yeast
cells [16].

Overall, our structure defines the binding mechanism of the
bipartite Prp20 NLS with the nuclear import receptor IMPα. This
interaction has not been described previously and, importantly,
provides new structural information relating to the mechanism of
IMPα NLS recognition. The linker region separating the
positively charged clusters within the bipartite NLS of Prp20,
whilst longer than many bipartite NLS linkers, does not perturb
the ability of these clusters to interact with the major and minor
sites of IMPα in a manner characteristic of other bipartite NLSs.
Whilst insertion of a residue between the classical P0 and P1
positions disrupted the classical binding observed at P0, this was
compensated by additional binding observed in the close vicinity,
highlighting both the flexibility of NLS recognition by
IMPα and the difficulty of precisely predicting NLSs.